Modeling Semantic Compositionality of Croatian Multiword Expressions

Authors

  • Jan Snajder
  • Petra Almic
Abstract

A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Determining the semantic compositionality of MWEs is important for many natural language processing tasks. We address the task of modeling semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics framework. We build and evaluate models based on Latent Semantic Analysis and the recently proposed neural network-based Skip-gram model, and experiment with different composition functions. We show that the compositionality scores predicted by the Skip-gram additive models correlate well with human judgments (ρ=0.50). When framed as a classification task, the model achieves an accuracy of 0.64.
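The additive composition idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, as is common in composition-based approaches, that the compositionality score is the cosine similarity between the composed constituent vector and the vector observed for the whole expression; the abstract does not spell out the scoring function, so this is an assumption. The function names and the toy three-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math


def add_compose(u, v):
    """Additive composition: element-wise sum of two constituent vectors."""
    return [a + b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality_score(w1_vec, w2_vec, mwe_vec):
    """Assumed scoring: similarity of the composed vector to the MWE's own vector."""
    return cosine(add_compose(w1_vec, w2_vec), mwe_vec)


# Toy vectors (hypothetical, for illustration only).
w1 = [1.0, 0.0, 1.0]
w2 = [0.0, 1.0, 1.0]
compositional_mwe = [1.0, 1.0, 2.0]  # same direction as w1 + w2
idiomatic_mwe = [1.0, -1.0, 0.0]     # orthogonal to w1 + w2

high = compositionality_score(w1, w2, compositional_mwe)
low = compositionality_score(w1, w2, idiomatic_mwe)
print(high > low)
```

In a real setting the vectors would come from an LSA or Skip-gram model trained on a corpus, and the resulting scores would be compared with human judgments using a rank correlation.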

Similar resources

Determining the Semantic Compositionality of Croatian Multiword Expressions

A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Being able to automatically determine the semantic (non-)compositionality of MWEs is important for many natural language processing tasks. We address the task of determining the semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics...

Learning Semantic Composition to Detect Non-compositionality of Multiword Expressions

Non-compositionality of multiword expressions is an intriguing problem that can be the source of error in a variety of NLP tasks such as language generation, machine translation and word sense disambiguation. We present methods of non-compositionality detection for English noun compounds using the unsupervised learning of a semantic composition function. Compounds which are not well modeled by ...

Compositionality And Multiword Expressions: Six Of One, Half A Dozen Of The Other?

In this talk, I will investigate the relationship between compositionality and multiword expressions, as part of which I will outline different approaches for formalising the notion of compositionality. I will then briefly review computational methods that have been proposed for modelling compositionality, and applications thereof. Finally, I will discuss possible future directions for modellin...

A Word Embedding Approach to Predicting the Compositionality of Multiword Expressions

This paper presents the first attempt to use word embeddings to predict the compositionality of multiword expressions. We consider both single- and multi-prototype word embeddings. Experimental results show that, in combination with a back-off method based on string similarity, word embeddings outperform a method using count-based distributional similarity. Our best results are competitive with, ...

Combining Linguistic Features for the Detection of Croatian Multiword Expressions

As multiword expressions (MWEs) exhibit a range of idiosyncrasies, their automatic detection warrants the use of many different features. Tsvetkov and Wintner (2014) proposed a Bayesian network model that combines linguistically motivated features and also models their interactions. In this paper, we extend their model with new features and apply it to Croatian, a morphologically complex and a ...

Journal:
  • Informatica (Slovenia)

Volume 39

Publication date: 2015